Abstract: Bit-interleaved coded modulation (BICM) uses long
interleavers that, at the receiver, require a large memory whose
size grows with the cardinality of the constellation. This memory
usually has a significant impact on the receiver cost. It is
therefore crucial to minimize the de-interleaver memory while
incurring only a negligible loss in performance. The
de-interleaver can operate either on quantized noisy symbols or
on groups of quantized log-likelihood ratios (LLRs). The latter
solution is usually more convenient because LLRs can be quantized
and compressed more efficiently than the noisy samples. Using this
receiver architecture, we present an LLR quantization and
compression procedure optimized to maximize the system's
generalized mutual information (GMI) under a constraint on the
maximum number of bits of the compressed LLR word. We consider a
typical digital video broadcasting over cable (DVB-C2) scenario.
Numerical results show that the proposed solution allows for a
memory saving of up to 30% at the expense of the additional
complexity due to the compression procedure.



I. Introduction

Bit-interleaved coded modulation (BICM) is an effective
technique for achieving high communication rates by encoding
data bits, interleaving the encoded bits, and then mapping bits
into symbols. In order to increase the spectral efficiency, large
symbol constellations can be used. For example, for the second
generation digital video broadcasting standard for cable
transmission (DVB-C2) [1], the constellation size is up to 4096
points and the symbol interleaver is up to 51776 symbols long; its
wireless counterpart, DVB-T2 [2], uses a constellation of up to
256 points, with a time interleaver that can contain up to 1023
forward error correction (FEC) codewords; and the Homeplug-AV
standard [3] for communication over powerline uses a constellation
of up to 1024 points. At the receiver, symbol de-interleaving is
usually first performed on the demodulated samples, followed by
demapping, which provides the log-likelihood ratio (LLR) for each
encoded bit, and then by bit de-interleaving before FEC decoding
[4]. With large symbol interleavers, these operations require a
large amount of memory, which has an impact on the cost and on the
area of a single-chip receiver. One solution consists in a compact
representation of the LLR, which can be obtained by quantization
and compression of this information. Note that both the
quantization and the compression of LLRs have been investigated to
reduce the memory occupation of systems employing hybrid automatic
repeat request (HARQ) [5], where multiple versions of the same
packets must be stored. Moreover, LLR compression is also used in
compress-and-forward systems [6] and their application to
multicell processing [7], [8].

The mutual information (MI) between the transmitted data
bits and the compressed words provides a good approximation
of what can be achieved with practical FEC schemes, and its
maximization can be considered a design criterion for LLR
quantization and compression. As LLRs associated with bits
that have been mapped to the same symbol are correlated (being
affected by the same noise sample), joint quantization and
compression of groups of bits can yield a higher MI. For example,
Danieli et al. proposed applying vector quantization to the
LLRs [5]; however, this solution becomes infeasible as the size
of the constellation gets larger, and other approaches have been
proposed. For a BPSK transmission over the additive white
Gaussian noise (AWGN) channel, the non-uniform LLR quantizer
that maximizes the MI is derived by Rave in [9]. Observing
that the quantized values are not uniformly distributed, Rave
suggested applying entropy coding in order to further reduce
the storage requirements. A suboptimal approach, where the MI is
maximized under the constraint that all quantized values have
the same probability, has been considered in [10], where the
analysis is carried out for BPSK transmissions over a Rayleigh
fading channel. Indeed, LLR compression is a crucial task in
modern communication chips, especially when large blocks
of soft bits must be handled, as for low-density parity-check
(LDPC) codes [11].

In this paper, we propose a quantization and compression
technique for LLRs in systems that use large constellations.
We focus, in particular, on the DVB-C2 system, where the
transmitted symbols are interleaved before being mapped on
different carriers of multiple orthogonal frequency division
multiplexing (OFDM) blocks. At the receiver, the samples
must be de-interleaved and demapped. In order to reduce the
memory occupation, we propose first to demap the received
signal and then to perform de-interleaving on groups of LLRs
(corresponding to the data symbols). In order to ease
de-interleaving, the total number of bits representing all the
LLRs associated with a single symbol is fixed. In this manner, the
symbol de-interleaver moves memory blocks of the same size.

To design both the quantization and the compression, we use
the generalized mutual information (GMI) [12], [13], [14], which
provides the achievable throughput, taking into account the
approximation incurred in computing the quantized LLRs. Our
first contribution is the LLR quantizer design that maximizes
the GMI for a given total number of quantization bits among
all LLRs. We not only adapt the quantization levels but
also optimize the number of bits used for the representation
of the LLR of each bit of the constellation. Our second
contribution stems from the observation that quantized LLRs
are not uniformly distributed. Therefore, we propose a lossy
compression procedure for the quantized LLRs. We begin
from a Huffman representation of the quantized LLRs. We
gather the quantized LLRs associated with a symbol into a
word. If this word is longer than a given number of bits, the
compressor replaces some quantized values with others that
have a shorter representation. We optimize the compressor
in terms of maximum GMI under the constraint on the total
length. This is a multidimensional multiple-choice knapsack
(MMCK) problem [15], for which we derive a suboptimal but
practical solution. Finally, we present numerical results for
typical DVB-C2 scenarios.

Fig. 1. Transmitter and channel models.

The rest of the paper is organized as follows. In Section
II, we describe the system model and introduce the receiver
architecture. In Section III, we provide the details of the
design of the quantizer. We describe the lossy compression
technique in Section IV. In Section V, we present and discuss
numerical results, comparing the various options introduced
in the previous sections. Lastly, we draw some conclusions in
Section VI.

II. System Model 

We consider the transmission scheme of Figure 1, where
data bits are encoded by FEC. Bit interleaving (BIN) and
mapping (MAP) of bits to $M$-QAM symbols follow. Encoded
bits are indicated as $b_{k,j} \in \{0, 1\}$, where¹
$k = 1, 2, \ldots, \log M$, and $j$ is the index of the QAM
symbol $s_j$. The generated symbols are then interleaved (SIN)
before transmission. Let $i = \mathcal{M}(j)$ be the map of the
symbol interleaver, i.e., the index in the interleaved vector
corresponding to input symbol $j$.

Symbol $s_i$ is transmitted on a fading channel, i.e., it
is multiplied by the channel gain $h_i$. Then complex additive
white Gaussian noise (AWGN) $n_i$ is added. The noise has zero
mean and power $\sigma^2$. This model describes the main features
of many communication systems, including those based on OFDM².
Single-carrier transmissions with linear equalization, as well as
MIMO systems with linear receivers, can be cast into this model.
Hereafter, we assume that the channel gains $h_i$ are known to
the receiver.

¹In this paper, $\log(x)$ denotes the base-2 logarithm of $x$,
and $\ln(x)$ denotes the base-e logarithm.

²If the cyclic prefix is longer than the channel impulse response,
and assuming perfect synchronization, the cascade of OFDM
modulation, the channel, and OFDM demodulation is equivalent to a
set of parallel memoryless fading channels, each with a different
gain $h_i$.



A. Receiver Implementation 

We consider the two receiver alternatives depicted in Figure 2.

Conventional Receiver: In this receiver, depicted in
Figure 2a, the received samples $r_i$ and the channel gains $h_i$
are first de-interleaved (SDI) and then passed to the demapper
(DEM) to obtain the LLR $\lambda_{k,j}$ associated with the
encoded bit $b_{k,j}$. For an implementation of the receiver on a
chip, the received samples, channel gains, and LLRs are
represented as quantized values; quantization is explicitly
shown in the figure by the block QUA. The quantized LLRs are
passed to a bit de-interleaver (BDI) and then to the FEC
decoder (DEC) for error correction. In this implementation two
blocks of memory, named $M_{SD}$ and $M_{BD}$, are needed.
$M_{SD}$ is associated with SDI and stores both the received
samples and the channel gains. $M_{BD}$, which is associated with
BDI, stores LLRs.

Proposed Receiver: In this receiver, depicted in Figure 2b,
demapping and symbol de-interleaving are swapped in order to
reduce both the complexity of the interleaver and the total
memory. The received sample $r_i$ is first demapped to obtain the
LLRs $\lambda_{k,i}$ associated with the encoded bits $b_{k,i}$,
for $k = 1, 2, \ldots, \log M$. The LLR $\lambda_{k,i}$ is
quantized into one of the $L_k$ possible quantization levels, and
then the index $v_{k,i}$ of the quantized level associated with
the LLR is stored. In this implementation the symbol
de-interleaver operates on words of quantized LLRs instead of on
the quantized received samples. Each word consists of the LLRs of
the bits mapped to the same symbol. We compress the quantized LLR
values, thus obtaining a smaller memory and reducing the memory
swapping operations of the de-interleaver. In particular, we
consider two components that perform compression (COM) and
decompression (UCOM) of the quantized levels $v_{k,i}$. We observe
that a simple implementation of the de-interleaver requires that
all compressed words be represented by the same number of bits
$N$. In this case, de-interleaving boils down to the permutation
of blocks of memory of the same size. In order to ensure that
compression generates at most $N$ bits for each transmitted
symbol, we allow for losses, i.e., the quantized indices $v_{k,i}$
may be substituted by other indices $\hat{v}_{k,i}$ represented by
fewer bits. After symbol de-interleaving, LLR words are
uncompressed into fixed-length quantization indices
$\hat{v}_{k,j}$ to allow for bit de-interleaving (BDI), and they
are finally mapped into quantized LLR values before being passed
to the FEC decoder. Note that in this case too we need two blocks
of memory, $M_{SD}$ and $M_{BD}$. Both of them store LLR quantized
levels.

The two architectures of Figure 2 can be compared in two
respects: complexity and memory. In terms of complexity, the
proposed implementation requires additional effort for
compression and decompression, while it simplifies symbol
de-interleaving, as it operates on smaller blocks. From the
memory point of view, both schemes require memory before the
interleavers (indicated by the dashed blocks in Figure 2). If the
number of bits required to store the LLRs is less than that
required to store the received symbols, the proposed architecture
is more efficient. In Section V-C we will compare the memory
required by the two architectures in more detail.

Fig. 2. Receiver architectures. (a) Conventional receiver.
(b) Proposed receiver.

B. LLR Statistics 

The LLR is defined as

$$ \lambda_{k,i}^{(\mathrm{exact})} = \ln \frac{\mathrm{Prob}(r_i \mid b_{k,i} = 1)}{\mathrm{Prob}(r_i \mid b_{k,i} = 0)} . \tag{1} $$

Assuming equal probability for all constellation points, the
minimum distance approximation of the LLR is given by [16]

$$ \lambda_{k,i}^{(\mathrm{approx})} = \frac{1}{\sigma^2} \left[ \min_{s_i \in S_k(0)} \| r_i - h_i s_i \|^2 - \min_{s_i \in S_k(1)} \| r_i - h_i s_i \|^2 \right] , \tag{2} $$

where $S_k(u)$ is the set of constellation points with the $k$-th
bit equal to $u \in \{0, 1\}$. As both LLR computation and
compression operate at a symbol level, unless explicitly required
in the following, we drop the symbol index $i$ or $j$. In this
paper, we consider the minimum distance approximation of
the LLR³, although the proposed solution applies also to other
approximations of the LLR (including its exact definition). In
conjunction with Gray mapping, the real and imaginary parts
can be treated independently [17]. For example, for 4-PAM
(or for each part of a 16-QAM) with distance ^ among the 
points at the transmitter, the LLR of the most significant bit 
is 

-2^ < r < 2^ (3) 
20 r>-2^. 



A2 = 



4^e(-2r- 



Within each interval, for a given channel value $h$ and a given
transmit symbol $s$, the LLR has a Gaussian distribution due to
the noise term. As the same value of $\lambda_k$ can be achieved
with various values of $r$, the LLRs conditioned on $b_k$ and $h$
are distributed as a piecewise Gaussian mixture [18]. The real axis

³For the sake of notation, in the following we drop the superscript
(approx), i.e., $\lambda_k = \lambda_k^{(\mathrm{approx})}$.



is partitioned into U intervals [ha^, ha^] for u — 1,2, ... ,U. 
For Afc e [ha^ , ha^] we have 

PA^\Bk,Hi>^k\bk,h) ^J2n ~ ^ 



-[ Gu V2TThjf,,u,k 



exp 



(4) 



where $\Lambda_k$, $B_k$, and $H$ are the random variables
corresponding to LLRs, bits, and channel gains, respectively, and
$\lambda_k$, $b_k$, and $h$ denote their realizations. Note
that $\gamma_{1,u,k}, \ldots, \gamma_{G_u,u,k}$ and
$m_{1,u,k}, \ldots, m_{G_u,u,k}$ are the Gaussian mixture
parameters of the $u$-th interval, which are also functions of
$b_k$. In [18], explicit expressions of the conditional LLR PDF
are derived for BPSK, QPSK, 16-QAM, and 64-QAM constellations. In
the following, we will also need
$p_{\Lambda_k|B_k}(\lambda_k|b_k)$, which can be obtained by
averaging (4) over the channel PDF, i.e.,

$$ p_{\Lambda_k|B_k}(\lambda_k|b_k) = E_H\left[ p_{\Lambda_k|B_k,H}(\lambda_k|b_k,h) \right] , \tag{5} $$

where $E_H[\cdot]$ denotes expectation with respect to the random
variable $H$. Note that, in general, there is no closed-form
expression for (5). In this paper we consider two channels: the
AWGN channel, where
$p_{\Lambda_k|B_k}(\lambda_k|b_k) = p_{\Lambda_k|B_k,H}(\lambda_k|b_k,1)$,
and the Rayleigh fading channel. For the latter, no closed-form
solution is available; hence, numerical integration is needed.

III. LLR Quantization 

The LLRs associated with the same transmitted symbol are
correlated random variables, as they are affected by the same
noise sample. Therefore, vector quantization [19] of the LLR
vector $\lambda_1, \lambda_2, \ldots, \lambda_{\log M}$ can be
applied. However, this technique is exceedingly complex for
large $M$.

Here, we propose instead that the LLR of each bit be
quantized by a tailored quantizer. In fact, each of the $\log M$
LLRs has different statistics, as shown in (4), and a great
performance benefit can be achieved by considering $\log M$
quantizers, each with its own quantization intervals. As noted
above, the statistics of the LLR depend on the channel gain $h_i$;
therefore, adapting the quantizer to the channel associated with
the LLR would also increase the accuracy of the quantized
representation. However, the decoder would then also need to know
the channel gain, and additional memory would have to be reserved
to store this information. In order to reduce memory occupation,
we consider here a scenario where channel gains are discarded
after the LLR computation, and the quantizers are not adapted
to the channel levels.

A. Quantization Procedure 

We focus on the uniform quantization of the LLRs, although
the derivations are easily extended to non-uniform quantization.
In particular, the LLR of the $k$-th bit is quantized by a
uniform quantizer having quantization step $q_k$ and
$L_k = 2^{w_k}$ levels, where $w_k$ is the number of bits used to
describe a level. The $L_k$ quantization intervals are

$$ \mathcal{I}_{k,\ell} = [d_{k,\ell-1}, d_{k,\ell}) , \tag{6} $$

with $\ell = 1, 2, \ldots, L_k$, where $d_{k,0} = -\infty$,
$d_{k,L_k} = \infty$, and

$$ d_{k,\ell} = \left( \ell - \frac{L_k}{2} \right) q_k , \qquad \ell = 1, 2, \ldots, L_k - 1 . \tag{7} $$

Note that $w_k$ and $q_k$ fully specify the quantizer for
$\lambda_k$. The quantization process is described as follows:

$$ \lambda_k \text{ is mapped to index } v_k = \ell \text{ if } \lambda_k \in \mathcal{I}_{k,\ell} . \tag{8} $$

For each index $v_k$, we have a corresponding quantized LLR
value $\lambda^{(Q)}_{k,v_k}$. Let the discrete random variable
$V_k$ be the quantization level index of $\Lambda_k$ and
$P_{V_k|B_k}(v_k|b_k)$ the conditional probability mass function
(PMF) of $V_k$, given $B_k$, which can be written as

$$ P_{V_k|B_k}(v_k|b_k) = \int_{\mathcal{I}_{k,v_k}} p_{\Lambda_k|B_k}(\lambda_k|b_k) \, d\lambda_k . \tag{9} $$

The unconditional PMF is given by

$$ P_{V_k}(v_k) = \sum_{b_k=0}^{1} P_{B_k}(b_k) \, P_{V_k|B_k}(v_k|b_k) . \tag{10} $$

The quantization of the LLR introduces a decoding loss.
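The quantization rule (6)-(8) can be sketched in a few lines of Python. This is only an illustration: the function names `thresholds` and `quantize` are hypothetical, and the finite thresholds are assumed to be the symmetric ones $(\ell - L_k/2) q_k$ of a mid-rise uniform quantizer.

```python
def thresholds(w_k, q_k):
    """Finite thresholds d_{k,l}, l = 1..L_k-1, of a symmetric uniform
    quantizer with L_k = 2**w_k levels and step q_k (assumed form of (7))."""
    levels = 2 ** w_k
    return [(l - levels / 2) * q_k for l in range(1, levels)]


def quantize(llr, w_k, q_k):
    """Return the index v_k in 1..L_k of the half-open interval
    [d_{k,l-1}, d_{k,l}) that contains llr, as in (8)."""
    v = 1
    for d in thresholds(w_k, q_k):
        if llr >= d:  # llr lies at or above this threshold: move up one level
            v += 1
    return v
```

With $w_k = 1$ the only finite threshold is zero, so `quantize` reduces to a hard decision on the sign of the LLR, whatever the value of $q_k$.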

In general, numerical methods must be used to compute (9).
For AWGN channels with a given channel gain $h$ and given
noise power $\sigma^2$, from (4) we have a closed-form expression
of the conditional PMF, i.e.,



PVk\Bk{vk\hk)=2_^2^jr 

U—1 fl — 1 



ltL,u,k^h 



(11) 



where Q( ) is the tail probability of the standard normal 
distribution and 

a„ = min{max{dfc^^^ , ha^}, ha^} (12) 

(in =min{max{dfc^i,j^_i,/ia^},/iajf}. (13) 



B. Quantization Design 

We design the quantizers with the objective of maximizing
the performance in terms of GMI. In the literature, the GMI is
proposed as an accurate performance measure for BICM systems
with mismatched decoders [12], [13], [14]. The decoder
is mismatched for two reasons: (i) the LLRs coming from a
given symbol are not independent, as inherently assumed by
the decoder; (ii) the decoder assumes unquantized LLRs.

Let us consider a binary decoder that takes as input the
vector of LLRs

$$ \boldsymbol{\lambda} = [\lambda_{1,1}, \lambda_{2,1}, \ldots, \lambda_{\log M,1}, \lambda_{1,2}, \ldots, \lambda_{\log M,N_s}] $$

having size $N_b = \log M \cdot N_s$. Let

$$ \mathbf{b} = [b_{1,1}, b_{2,1}, \ldots, b_{\log M,1}, b_{1,2}, \ldots, b_{\log M,N_s}] $$

be one of the possible codewords. The decoder maximizes the
following metric:

Ns logM 

J=l k=l 

Therefore the output of the decoder is 



b — argmax fi(r, b) 



(15) 



where C is the set of valid codewords. The GMI represents the
maximum rate that can be achieved with vanishing word error
probability (when the size of the word goes to infinity).
In particular, an upper bound on the error probability is given
by [20, Ch. 5],

IorM 



Prob(6 7^ 6) < 



1 



logAf 



E2- 

fe=i 



(16) 



where 



EiXk,R) 



max max|e(At, p, x) 

0<p<l x>0 



pR} , (17) 



is the random coding exponent, and (assuming equiprobable 
symbols) the generalized Gallager function is defined as 




e(Afc,p,a;) = - logE^^ 2^Ps,(6fc) 

'pBM+VB,{l-bk)e~^'''-'^>^-^ 



(18) 

From (16) we note that the word error probability goes to
zero as $N_b$ goes to infinity if the random coding exponent is
strictly positive, and in this case the rate $R$ is achievable.
The maximum achievable rate is given by the GMI. In [13] it has
been proved that the GMI can be written as

$$ \mathrm{GMI} = \max_{x > 0} \sum_{k=1}^{\log M} \mathrm{BGMI}_k(x) , \tag{19} $$

where the binary GMI (BGMI) is

$$ \mathrm{BGMI}_k(x) = - \sum_{b_k=0}^{1} P_{B_k}(b_k) \, E\left[ \log\left( P_{B_k}(b_k) + P_{B_k}(1 - b_k) \, e^{-x (2 b_k - 1) \Lambda_k} \right) \,\middle|\, B_k = b_k \right] . \tag{20} $$

Considering the quantization rule in (8) and equiprobable bits,
we can rewrite (20) as

$$ \mathrm{BGMI}_k(x) = 1 - \sum_{v_k=1}^{L_k} \frac{1}{2} \left[ P_{V_k|B_k}(v_k|0) \log\left( 1 + e^{x \lambda^{(Q)}_{k,v_k}} \right) + P_{V_k|B_k}(v_k|1) \log\left( 1 + e^{-x \lambda^{(Q)}_{k,v_k}} \right) \right] . \tag{21} $$

We first note that the quantized LLR value that maximizes the
GMI can be obtained by setting to zero the derivative of the
BGMI with respect to $\lambda^{(Q)}_{k,v_k}$. Doing so yields

$$ \lambda^{(Q)}_{k,v_k} = \frac{1}{x} \ln \frac{P_{V_k|B_k}(v_k|1)}{P_{V_k|B_k}(v_k|0)} . \tag{22} $$

Inserting (22) into (21) we obtain

$$ \mathrm{BGMI}_k = I(B_k; V_k) = \frac{1}{2} \sum_{v_k=1}^{L_k} \sum_{b_k=0}^{1} P_{V_k|B_k}(v_k|b_k) \log \frac{P_{V_k|B_k}(v_k|b_k)}{P_{V_k}(v_k)} , \tag{23} $$

which coincides with the MI between $B_k$ and $V_k$, and does not
depend on $x$. Substituting (23) in (19) yields

$$ \mathrm{GMI} = \sum_{k=1}^{\log M} I(B_k; V_k) = \frac{1}{2} \sum_{k=1}^{\log M} \sum_{v_k=1}^{L_k} \sum_{b_k=0}^{1} P_{V_k|B_k}(v_k|b_k) \log \frac{P_{V_k|B_k}(v_k|b_k)}{P_{V_k}(v_k)} . \tag{24} $$

In other words, the maximum rate achievable by a decoder
using the metric (14) with quantized LLRs is given by the
sum over $k$ of the mutual information between $B_k$ and $V_k$.
For the design of both the quantizer and the compressor, we
assume that (22) holds true. As mentioned, we are using a
specific quantizer for each of the $\log M$ bits mapped to a
symbol. Therefore the objective of the quantization design is to
optimize the vector $\mathbf{w} = (w_1, w_2, \ldots, w_{\log M})$
and the vector $\mathbf{q} = (q_1, q_2, \ldots, q_{\log M})$,
where $w_k$ and $q_k$ are the bit length and the quantization step
of the quantizer that operates on the LLR of the $k$-th bit. The
quantizer design aims at maximizing the GMI in (24) under the
constraint of using $W$ bits for the quantization of all LLRs of a
word. Mathematically, we aim at solving

$$ \max_{\substack{q_1, \ldots, q_{\log M} \\ w_1, \ldots, w_{\log M}}} \; \sum_{k=1}^{\log M} I(B_k; V_k) , \tag{25a} $$

$$ \text{s.t.} \quad \sum_{k=1}^{\log M} w_k = W . \tag{25b} $$

Unfortunately, the constrained maximization (25) is a mixed
integer programming (MIP) problem and cannot be solved in
closed form. We must resort to numerical methods to optimize
both $\mathbf{q}$ and $\mathbf{w}$.
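As a small numerical aid for the design criterion, the per-bit mutual information in (23) and the GMI-maximizing quantized LLR values in (22) can be evaluated directly from the two conditional PMFs. The sketch below assumes equiprobable bits; the function names and the list-based PMF representation are illustrative choices, not part of the original design.

```python
import math


def mutual_information(p0, p1):
    """I(B_k; V_k) in bits for equiprobable bits, as in (23).
    p0[v] = P(v | b=0) and p1[v] = P(v | b=1) over the same levels."""
    mi = 0.0
    for a, b in zip(p0, p1):
        p_v = 0.5 * (a + b)  # unconditional PMF, as in (10)
        for p in (a, b):
            if p > 0.0:  # a zero-probability term contributes nothing
                mi += 0.5 * p * math.log2(p / p_v)
    return mi


def reconstruction_values(p0, p1, x=1.0):
    """Quantized LLR values (1/x) ln(P(v|1) / P(v|0)), as in (22).
    Assumes every level has non-zero probability under both bit values."""
    return [math.log(b / a) / x for a, b in zip(p0, p1)]
```

For instance, a two-level quantizer whose output is wrong with probability 0.1 behaves like a binary symmetric channel, and `mutual_information([0.9, 0.1], [0.1, 0.9])` evaluates to about 0.531 bits.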



Optimization of the quantization steps $q_k$: For each
$k = 1, 2, \ldots, \log M$ and $w_k = 1, 2, \ldots, W$, we first
find the $q_k$ that maximizes the BGMI, i.e.,

$$ q_k(w_k) = \arg\max_{q_k} I(B_k; V_k) . \tag{26} $$

The above optimization can be performed numerically by
substituting (9) and (10) into (23). Let
$I_{k,w_k} = I(B_k; V_k(q_k(w_k)))$ be the mutual information
between $B_k$ and $V_k$ using $w_k$ bits and the quantization step
$q_k(w_k)$ obtained in (26). As mentioned, considering the Gray
mapping, we can treat independently the real and the imaginary
parts of the constellation points. We map the bits $b_k$ on the
imaginary axis when $k$ is odd. Similarly, we map the bits $b_k$
on the real axis when $k$ is even. The symmetry introduced by the
Gray mapping implies $q_{2u-1} = q_{2u}$, with
$u = 1, 2, \ldots, \frac{\log M}{2}$. In Tables I and II we
report the results of (26) for 4096-QAM and both AWGN
and Rayleigh fading channels.

Note that when $w_k = 1$ we are considering a hard decision
on the LLR, and the BGMI does not depend on $q_k$.
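The step search in (26) can be illustrated with a simple grid search. The sketch below does not use the piecewise Gaussian mixture of the paper; as a stand-in it assumes a toy BPSK-over-AWGN model in which the LLR given the bit is Gaussian with mean $\pm m$ and variance $2m$. The helper names, the toy model, and the grid are all assumptions made for illustration.

```python
import math


def thresholds(w, q):
    """Finite thresholds (l - L/2) * q, l = 1..L-1, with L = 2**w."""
    levels = 2 ** w
    return [(l - levels / 2) * q for l in range(1, levels)]


def gaussian_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def index_pmf(mean, std, w, q):
    """PMF of the quantization index for a Gaussian LLR, as in (9)."""
    edges = [-math.inf] + thresholds(w, q) + [math.inf]
    return [gaussian_cdf(hi, mean, std) - gaussian_cdf(lo, mean, std)
            for lo, hi in zip(edges, edges[1:])]


def mutual_information(p0, p1):
    """I(B; V) in bits for equiprobable bits, as in (23)."""
    mi = 0.0
    for a, b in zip(p0, p1):
        p_v = 0.5 * (a + b)
        for p in (a, b):
            if p > 0.0:
                mi += 0.5 * p * math.log2(p / p_v)
    return mi


def best_step(w, m, grid):
    """Grid search for the step maximizing I(B; V), as in (26), under the
    toy model LLR | b ~ N(+m or -m, 2m)."""
    std = math.sqrt(2.0 * m)
    best_q, best_mi = None, -1.0
    for q in grid:
        mi = mutual_information(index_pmf(-m, std, w, q),
                                index_pmf(+m, std, w, q))
        if mi > best_mi:
            best_q, best_mi = q, mi
    return best_q, best_mi
```

Calling `best_step(w, 2.0, [0.05 * i for i in range(1, 201)])` for `w = 1, 2, 3` returns a non-decreasing sequence of MI values, and for `w = 1` the result is the same for every step on the grid, in line with the remark on hard decisions.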

Optimization of the bit lengths $w_k$: After having optimized
the quantization step $q_k$ for each $k$ and each $w_k$, our focus
is to find the best $\mathbf{w}$ subject to (25b). Therefore, the
optimization objective (25) becomes

$$ \max_{w_1, \ldots, w_{\log M}} \; \sum_{k=1}^{\log M} I(B_k; V_k) , \quad \text{s.t. (25b)} . \tag{27} $$

Our approach to solving (27) is to assign one bit at a
time to the $k^*$-th quantizer that yields the highest gain in
terms of MI, so that
$k^* = \arg\max_k \{ I_{k,w_k+1} - I_{k,w_k} \}$.
Therefore, after having computed $q_k(w_k)$ and $I_{k,w_k}$ for
each $k = 1, 2, \ldots, \log M$ and $w_k = 1, 2, \ldots, W$, the
optimization (25) is solved as follows:

1) Initialize $\mathbf{w} \leftarrow (0, \ldots, 0)$.

2) For $W$ iterations do:

a) Find $k^* = \arg\max_k \{ I_{k,w_k+1} - I_{k,w_k} \}$.

b) Set $w_{k^*} = w_{k^*} + 1$.

This greedy procedure is optimal if $I_{k,w_k}$ is an upper
convex sequence of $w_k$⁴. As proved in the Appendix, the greedy
procedure provides the same result as an exhaustive search if the
MI gain obtained by using $w_k + 1$ bits instead of $w_k$
decreases as $w_k$ increases.
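The greedy bit allocation above can be sketched as follows. The table layout `mi_table[k][w]`, holding $I_{k,w}$ for $w = 0, 1, \ldots$ with $I_{k,0} = 0$, is an assumption made for this illustration, as are the function name and the numbers in the usage note.

```python
def greedy_bit_allocation(mi_table, total_bits):
    """Assign total_bits one at a time to the quantizer with the largest
    MI gain. mi_table[k][w] is the MI of bit k when w bits are used
    (w = 0 .. w_max, with mi_table[k][0] = 0)."""
    w = [0] * len(mi_table)
    for _ in range(total_bits):
        best_k, best_gain = None, -1.0
        for k, row in enumerate(mi_table):
            if w[k] + 1 < len(row):  # one more bit is tabulated for bit k
                gain = row[w[k] + 1] - row[w[k]]
                if gain > best_gain:
                    best_k, best_gain = k, gain
        if best_k is None:  # every quantizer is already at its maximum length
            break
        w[best_k] += 1
    return w
```

With the made-up table `[[0, 0.9, 0.95, 0.97], [0, 0.5, 0.8, 0.9]]` and `total_bits = 3`, the function returns `[1, 2]`; the per-bit gains in this table decrease with `w`, and an exhaustive search over the four possible allocations gives the same answer.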

Tables III and IV show the results of this optimization for
our study case with 4096-QAM, for the AWGN and the Rayleigh
fading channel, respectively. In this case too, the symmetry
introduced by Gray mapping implies $w_{2u-1} = w_{2u}$, with
$u = 1, 2, \ldots, \frac{\log M}{2}$.



IV. LLR Compression 

The second part of this paper is based on the observation
that the quantized LLR levels are not uniformly distributed;
therefore, compression can reduce the memory needed to store
the LLRs. Let $\mathbf{v} = (v_1, v_2, \ldots, v_{\log M})$ be the
vector of the LLR quantized levels coming from the same received
symbol.

⁴Although we could not prove the upper convexity of $I_{k,w_k}$
under general conditions, as remarked in Section V, this property
holds true in all the cases considered in this paper, for both
AWGN and fading channels.

In order to allow the symbol de-interleaver to move blocks of
the same size, the compression procedure must represent each
$\mathbf{v}$ with the same number of bits. Our task is then to
design a procedure that maps the $W$ bits representing
$\mathbf{v}$ into $N$ bits.

For this purpose, we propose to perform a lossy compression
in two steps. First, we apply lossless entropy coding separately
to each $v_k$; then, if the number of bits exceeds $N$, we
perform the further LLR compression described in the following.

For the lossless compression, we apply Huffman coding [21]
at the output of each LLR quantizer. Let $m_{k,v_k}$ be the length
of the Huffman codeword that represents the level $v_k$. Then
the number of bits required to represent $\mathbf{v}$ is

$$ \bar{N} = \sum_{k=1}^{\log M} m_{k,v_k} . \tag{28} $$

If $\bar{N} \le N$, no further compression is needed. The vector
$\mathbf{v}$ can be stored as it is, potentially padded with
zeros to make it of length $N$. Otherwise, we modify one or more
quantizer outputs so that the new $\bar{N}$ is smaller than or
equal to the target $N$. Clearly this operation causes a
performance loss, which we can quantify in terms of GMI. Our aim
is to minimize this loss while reaching the target length $N$.
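To make the lossless step concrete, the sketch below computes Huffman codeword lengths from a PMF with a binary heap and then evaluates the word length in (28). It is a generic Huffman construction; the function names are hypothetical, levels are indexed from 0 rather than 1, and the example PMF is made up.

```python
import heapq
from itertools import count


def huffman_lengths(probs):
    """Codeword length of each level for a binary Huffman code."""
    if len(probs) == 1:
        return [1]
    tie = count()  # tie-breaker so the heap never compares the lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:  # every merged symbol moves one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths


def word_length(lengths_per_bit, v):
    """Total number of bits of the word, as in (28); v[k] is the 0-based
    level index of bit k and lengths_per_bit[k][v[k]] its codeword length."""
    return sum(lengths_per_bit[k][v_k] for k, v_k in enumerate(v))
```

For the PMF `[0.5, 0.25, 0.125, 0.125]` the lengths are `[1, 2, 3, 3]`, so a word made of two such bits needs between 2 and 6 bits depending on the levels observed; the lossy step is what guarantees a fixed budget $N$.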

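Let $\delta_{k,a,b}$ be the average GMI loss incurred when we
replace the LLR quantized level $v_k = a$ with another level,
$\hat{v}_k = b$. Note that, by replacing $v_k = a$ with
$\hat{v}_k = b$, we obtain the new PMFs

$$ P_{\hat{V}_k|B_k}(v_k|b_k) = \begin{cases} P_{V_k|B_k}(v_k|b_k) & v_k \neq a, b \\ 0 & v_k = a \\ P_{V_k|B_k}(a|b_k) + P_{V_k|B_k}(b|b_k) & v_k = b , \end{cases} \tag{29} $$

$$ P_{\hat{V}_k}(v_k) = \begin{cases} P_{V_k}(v_k) & v_k \neq a, b \\ 0 & v_k = a \\ P_{V_k}(a) + P_{V_k}(b) & v_k = b . \end{cases} \tag{30} $$

Therefore, considering (29), (30), and (23), the average GMI
loss $\delta_{k,a,b}$ is given by

$$ \delta_{k,a,b} = \frac{1}{2} \sum_{b_k=0}^{1} \left[ P_{V_k|B_k}(a|b_k) \log \frac{P_{V_k|B_k}(a|b_k)}{P_{V_k}(a)} + P_{V_k|B_k}(b|b_k) \log \frac{P_{V_k|B_k}(b|b_k)}{P_{V_k}(b)} - \left( P_{V_k|B_k}(a|b_k) + P_{V_k|B_k}(b|b_k) \right) \log \frac{P_{V_k|B_k}(a|b_k) + P_{V_k|B_k}(b|b_k)}{P_{V_k}(a) + P_{V_k}(b)} \right] . \tag{31} $$

Note that $\delta_{k,a,b}$ is zero if $a = b$; otherwise it is
non-negative.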

In order to reach the compression target $N$, one or more
LLR quantized levels $v_k$ will be replaced with new levels
$\hat{v}_k$ having a shorter representation. The problem is to
find the vector
$\hat{\mathbf{v}} = (\hat{v}_1, \hat{v}_2, \ldots, \hat{v}_{\log M})$
that minimizes the average GMI loss while keeping
$\bar{N} \le N$. Mathematically, we aim at solving

$$ \min_{\hat{v}_1, \ldots, \hat{v}_{\log M}} \; \sum_{k=1}^{\log M} \delta_{k,v_k,\hat{v}_k} , \tag{32a} $$

$$ \text{s.t.} \quad \sum_{k=1}^{\log M} m_{k,\hat{v}_k} \le N . \tag{32b} $$

This problem can be seen as a multidimensional multiple-choice
knapsack (MMCK) problem [15]. Unfortunately, the MMCK problem is
NP-hard [15]; thus we resort to the greedy iterative approach
that follows.

Greedy LLR compression: Starting from $\mathbf{v}$, at each
iteration the algorithm selects the substitution
$\hat{v}_k \to v'_k$ yielding the smallest average GMI loss,
considering only the $v'_k$ such that
$m_{k,v'_k} < m_{k,\hat{v}_k}$. The length $\bar{N}$ is decreased
by at least 1 at each iteration. We stop the procedure when
$\bar{N} \le N$. The iterative procedure works as follows:

1) Initialize $\hat{v}_1 = v_1, \hat{v}_2 = v_2, \ldots, \hat{v}_{\log M} = v_{\log M}$.

2) Stop if (32b) is satisfied.

3) For each $k = 1, \ldots, \log M$ and $v'_k = 1, \ldots, L_k$, if
$m_{k,v'_k} \ge m_{k,\hat{v}_k}$ then set
$\delta_{k,v_k,v'_k} = \infty$.

4) Find

$$ (k^*, v'_{k^*}) = \arg\min_{\substack{k = 1, \ldots, \log M \\ v'_k = 1, \ldots, L_k}} \delta_{k,v_k,v'_k} . \tag{33} $$

5) Set $\hat{v}_{k^*} = v'_{k^*}$. Go to 2).

This greedy procedure, which is suboptimal in general, is
optimal if it converges in one iteration. We have two bounds
on the number of iterations required for convergence. On
the one hand, as at each iteration we set at least one value of
$\delta_{k,v_k,v'_k}$ to $\infty$, we have

$$ \#\,\text{iterations} \le \sum_{k=1}^{\log M} L_k . \tag{34} $$

On the other hand, as $\bar{N}$ is decreased by at least one bit
at each iteration, we have

$$ \#\,\text{iterations} \le \bar{N} - N , \tag{35} $$

and usually this second condition provides the tightest bound.
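The greedy compression can be sketched as below, under stated assumptions: levels are indexed from 0, the loss of a candidate is measured with (31) from the original level $v_k$ to the candidate, and candidates are restricted to levels whose codeword is strictly shorter than that of the current level. The function names and the data layout (one PMF pair and one length list per bit) are hypothetical.

```python
import math


def gmi_loss(p0, p1, a, b):
    """Average GMI loss (31) when level a is replaced by level b.
    p0[v] = P(v | b_k=0), p1[v] = P(v | b_k=1), equiprobable bits."""
    if a == b:
        return 0.0

    def term(p, p_v):
        return p * math.log2(p / p_v) if p > 0.0 else 0.0

    pv_a = 0.5 * (p0[a] + p1[a])
    pv_b = 0.5 * (p0[b] + p1[b])
    loss = 0.0
    for cond in (p0, p1):
        loss += 0.5 * (term(cond[a], pv_a) + term(cond[b], pv_b)
                       - term(cond[a] + cond[b], pv_a + pv_b))
    return loss


def greedy_compress(v, p0, p1, lengths, n_max):
    """Replace levels one at a time until the word fits in n_max bits.
    v[k] is the level of bit k; p0[k], p1[k], lengths[k] describe bit k."""
    v_hat = list(v)
    while sum(lengths[k][v_hat[k]] for k in range(len(v))) > n_max:
        best = None
        for k in range(len(v)):
            for cand in range(len(lengths[k])):
                # only strictly shorter codewords are admissible
                if lengths[k][cand] < lengths[k][v_hat[k]]:
                    loss = gmi_loss(p0[k], p1[k], v[k], cand)
                    if best is None or loss < best[0]:
                        best = (loss, k, cand)
        if best is None:
            raise ValueError("target length cannot be reached")
        v_hat[best[1]] = best[2]
    return v_hat
```

Because every accepted substitution shortens the word by at least one bit, the loop ends after at most $\bar{N} - N$ iterations, which mirrors the bound in (35).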

Joint Optimization of W and N: In the previous sections we
have provided a detailed design of the LLR quantization and
compression. Following the proposed scheme, the only two
parameters we need to set in order to specify the quantization
and compression procedure are $W$ and $N$, which represent
the number of bits at the output of the quantizer and of the
compressor, respectively. Only $N$ determines the final size
of the memory, but both of them have an effect on the
performance. In fact, if $W$ is much higher than $N$, we will
have a higher GMI at the output of the quantizer, but the lossy
compression will be aggressive and will introduce a significant
loss. We do not know an easy way to determine the best $W$
for a given $N$. In the numerical results reported in Figures 8
and 9 we tested several values of $W$ for each $N$ and chose
the one that gives the best performance.

Fig. 3. BGMI of the quantized LLR of the MSB of a 64-QAM constellation
over AWGN channel with C/N = 10 dB, as a function of the quantization
step $q_1$, for several values of $w_1$. Lines represent analytical
results and markers represent Monte Carlo simulations.

Fig. 4. BGMI of the quantized LLR of the LSB of a 64-QAM constellation
over AWGN channel with C/N = 10 dB, as a function of the quantization
step $q_5$, for several values of $w_5$. Lines represent analytical
results and markers represent Monte Carlo simulations.



V. Numerical Results 

We evaluate the performance of the proposed solutions
on the DVB-C2 standard for cable television. This standard
provides OFDM with 4096 subcarriers, BICM with LDPC
codes, and a symbol interleaver (a combination of frequency
and time interleaving), which fits the scheme of Figure 1.
In particular, the symbol interleaver is a row-column block
interleaver, with a number of rows up to 16 OFDM blocks and
with a number of columns up to 3236 (corresponding to the
maximum number of data symbols in an OFDM block). Various
constellation sizes are provided, from 16-QAM up to 4096-QAM
with Gray mapping. Hence, in the worst-case scenario,
the interleaving block contains 51776 data cells or 621312
LLR values. In the following, we will refer to the carrier-to-noise
(C/N) ratio as the SNR on each subcarrier after OFDM
demodulation.
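As a quick check of the worst-case figures, the interleaver dimensions and the number of bits per 4096-QAM symbol ($\log 4096 = 12$) give:

```latex
\begin{align*}
16 \times 3236 &= 51776 \quad \text{data cells}, \\
51776 \times \log 4096 = 51776 \times 12 &= 621312 \quad \text{LLR values}.
\end{align*}
```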

A. Quantization Performance 

Figures|3]and|4]show the BGMI obtained from the quantized 
LLRs as a function of both qk and Wk, for a C/N ratio of 10 dB, 
which represents the working point for the 64-QAM. Results 
are reported for both the least significant bit (LSB) and the 
most significant bit (MSB) along the real axis of 64-QAM 
symbols, i.e., for fc = 1 and fc = 5, respectively. Lines are 
obtained using the closed form expression of the PDF of the 
quantized LLRs, and markers show results obtained by Monte 
Carlo simulations. We see perfect overlap between analytical 
and simulation results. 

First, we note that, for each value of wt we have only one 
optimum value of the quantization step qk that maximizes the 
BGMI. Then, we observe that both the maximum BGMI and 
the corresponding values of q^ are different for the LSB and 
MSB. The same holds also for the other data bits (results 
are not reported here), with a behaviour similar to that of 
Figures [3] and |4] This justifies the use of different quantization 
steps for each bit of the constellation. We note also that as 



TABLE I 

Best g^. which maximizes the MI for each k and considering 
A4 096-QAM 



""fc 


1,2 


3,4 


5,6 


k 

7,8 


9, 10 


11, 12 


2 


3.73 


3.40 


3.13 


2.93 


2.53 


1.80 


3 


2.23 


2.00 


1.83 


1.77 


1.46 


1.03 


4 


1.21 


1.12 


1.05 


0.97 


0.84 


0.55 


5 


0.75 


0.66 


0.61 


0.52 


0.47 


0.28 


6 


0.38 


0.36 


0.34 


0.32 


0.26 


0.14 



the number of bits $w_k$ increases, the maximum BGMI gets
closer to the BGMI obtained with unquantized LLRs, and the
gain obtained by using $w_k + 1$ bits instead of $w_k$ gets smaller. Also,
for large quantization steps, the number of bits $w_k$ does not
affect the BGMI, because the additional bits only provide
quantization intervals for large LLR values, which do not
contribute significantly to the BGMI.
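
The two observations above (one optimum step for each number of bits, and a shrinking gain for each added bit) can be illustrated with a minimal Python sketch. It is not the closed-form BGMI expression used for Figures 3 and 4. As simplifying assumptions, the LLR is modelled as Gaussian with mean +mu or -mu and variance 2*mu (the BPSK-over-AWGN case), the quantizer is uniform with 2^w cells and saturation, and the metric is the plain mutual information between the bit and the quantizer output. The names `quantized_mi` and `best_step` and the value `mu = 2.0` are hypothetical.

```python
import math

def gauss_cdf(x, mean, var):
    """CDF of a Gaussian with the given mean and variance."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def quantized_mi(q, w, mu):
    """Mutual information (bits) between an equiprobable bit and its LLR
    after a uniform quantizer with 2**w cells and step q.

    Assumed toy model: the LLR is Gaussian with mean +mu (bit 0) or
    -mu (bit 1) and variance 2*mu.
    """
    half = 2 ** (w - 1)
    # Thresholds at multiples of q, symmetric around zero; the two
    # outermost cells extend to infinity (saturation).
    edges = [-math.inf] + [j * q for j in range(-half + 1, half)] + [math.inf]
    mi = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        p0 = gauss_cdf(hi, mu, 2 * mu) - gauss_cdf(lo, mu, 2 * mu)
        p1 = gauss_cdf(hi, -mu, 2 * mu) - gauss_cdf(lo, -mu, 2 * mu)
        pz = 0.5 * (p0 + p1)  # probability of this cell
        for p in (p0, p1):
            if p > 0.0:
                mi += 0.5 * p * math.log2(p / pz)
    return mi

def best_step(w, mu, steps):
    """Grid search for the step q that maximizes the quantized MI."""
    return max(steps, key=lambda q: quantized_mi(q, w, mu))

mu = 2.0                                   # hypothetical LLR mean
steps = [0.05 * i for i in range(1, 201)]  # candidate steps 0.05 ... 10
for w in (1, 2, 3, 4):
    q_opt = best_step(w, mu, steps)
    print(w, round(q_opt, 2), round(quantized_mi(q_opt, w, mu), 4))
```

With the same step, a quantizer with one more bit only adds thresholds, so the maximized value in this sketch cannot decrease as `w` grows.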

We then consider larger constellations, in particular the
4096-QAM constellation used in DVB-C2, which represents
the worst-case scenario for the symbol interleaver memory
size. The following results were obtained by considering C/N
= 32.2 dB for AWGN and C/N = 34 dB for Rayleigh fading,
because, according to [1, Table 20, p. 128], these are the
lowest working points for 4096-QAM. In Tables I and II
we report the optimal $q_k$, solving (26), for $w_k = 2, 3, \ldots, 6$,
and for each LLR position of the 4096-QAM constellation,

TABLE II

Best $q_k$, which maximizes the MI for each $k$ and $w_k$, considering
4096-QAM and block Rayleigh fading.

$w_k$   k = 1,2   k = 3,4   k = 5,6   k = 7,8   k = 9,10   k = 11,12
2       2.60      2.40      2.27      2.07      1.73       1.53
3       1.29      1.26      1.20      1.11      1.00       0.86
4       0.79      0.72      0.65      0.64      0.59       0.52
5       0.39      0.43      0.36      0.34      0.34       0.33
6       0.21      0.25      0.23      0.21      0.21       0.19

TABLE III

Optimal bit distribution sets for 4096-QAM.

$W$   $w_1,w_2$   $w_3,w_4$   $w_5,w_6$   $w_7,w_8$   $w_9,w_{10}$   $w_{11},w_{12}$
12    -           -           -           -           -              -
14    1           1           1           1           2              1
16    1           1           1           1           2              2
18    1           1           1           2           2              2
20    1           1           2           2           2              2
22    1           2           2           2           2              2
24    1           2           2           2           3              2
26    1           2           2           2           3              3
28    2           2           2           2           3              3
30    2           2           2           3           3              3
32    2           2           3           3           3              3
34    2           2           3           3           4              3
36    2           3           3           3           4              3
38    2           3           3           3           4              4
40    2           3           3           4           4              4
42    3           3           3           4           4              4
44    3           3           4           4           4              4
46    3           3           4           4           5              4
48    3           4           4           4           5              4
50    3           4           4           4           5              5


TABLE IV

Optimal bit distribution sets for 4096-QAM, Rayleigh fading.

$W$   $w_1,w_2$   $w_3,w_4$   $w_5,w_6$   $w_7,w_8$   $w_9,w_{10}$   $w_{11},w_{12}$
12    -           -           -           -           -              -
14    1           1           1           1           2              1
16    1           1           1           1           2              2
18    1           1           1           2           2              2
20    1           1           2           2           2              2
22    1           2           2           2           2              2
24    1           2           2           2           2              3
26    2           2           2           2           2              3
28    2           2           2           2           3              3
30    2           2           2           3           3              3
32    2           2           3           3           3              3
34    2           2           3           3           3              4
36    2           2           3           3           4              4
38    2           3           3           3           4              4
40    2           3           3           4           4              4
42    2           3           4           4           4              4
44    3           3           4           4           4              4
46    3           3           4           4           5              4
48    3           3           4           4           5              5
50    3           4           4           4           5              5



$k$, in AWGN and Rayleigh fading conditions, respectively.
In Figure 5, the maximized BGMI for each bit and for each
value of $w_k$ is shown. Again, we observe that the BGMI
is significantly different for each bit of the constellation and
also that the gain achieved by adding quantization levels is
different for each bit. For example, going from $w_k = 1$ to
$w_k = 6$ for the MSB provides a BGMI increase of about
0.025 bit/s/Hz, while for the LSB we have a BGMI gain of
0.12 bit/s/Hz. This justifies the use of a different number of
quantization levels for each data bit and therefore problem
(27). Furthermore, as also noted in Figures 3 and 4, the
increase of BGMI obtained by using $w_k + 1$ bits instead of $w_k$
decreases as $w_k$ increases. This guarantees that the proposed
algorithm for solving (27) returns the same result as an
exhaustive search. Lastly, in Tables III and IV we report the
results of the $\mathbf{w}$ optimization, showing the optimal distribution
of bits $w_k$ solving (25), for both the AWGN and the Rayleigh fading
channel. As expected, we observe that a finer quantization (i.e.,
a higher $w_k$) of the LLRs associated with the LSBs, which are
less protected by the Gray mapping, pays off.



Fig. 5. BGMI of the quantized LLR for different values of $w_k$, using the
optimal quantization step $q_k$, considering 4096-QAM over an AWGN channel
with C/N = 32.2 dB. Curves are shown for $k$ = 1, 3, 5, 7, 9, 11 as a
function of $w_k$, with $w_k = \infty$ as the unquantized reference.



B. Quantization and Compression Performance 

We now evaluate the effect of the LLR quantization and
compression in terms of SNR gap, i.e., the amount of additional
transmit power (or noise power reduction) required
when quantization is used in order to achieve the same GMI
as a receiver operating without quantization.
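
The SNR gap defined above can be computed from two GMI-versus-C/N curves, as in the following minimal Python sketch. The curves `gmi_ref` and `gmi_quant` are hypothetical stand-ins (a log2(1 + SNR) curve and a copy of it shifted by 0.3 dB), not results from this paper, and the helper names are assumptions as well. The only requirement is that both curves are non-decreasing in the C/N.

```python
import math

def cn_for_target(gmi, target, lo=-10.0, hi=60.0, tol=1e-9):
    """Smallest C/N in dB at which the non-decreasing curve gmi() reaches
    target, found by bisection on the interval [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gmi(mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi

def snr_gap(gmi_ref, gmi_quant, cn_db):
    """Extra C/N (dB) needed by the quantized receiver to reach the GMI
    that the unquantized receiver achieves at cn_db."""
    target = gmi_ref(cn_db)
    return cn_for_target(gmi_quant, target) - cn_db

# Hypothetical stand-in curves, for illustration only.
def gmi_ref(cn_db):
    return math.log2(1.0 + 10.0 ** (cn_db / 10.0))

def gmi_quant(cn_db):
    return gmi_ref(cn_db - 0.3)  # same curve shifted by 0.3 dB

gap = snr_gap(gmi_ref, gmi_quant, 32.2)
print(f"SNR gap: {gap:.3f} dB")
```

Since the quantized stand-in curve is the reference shifted by 0.3 dB, the sketch recovers a gap of 0.3 dB, which is a quick sanity check of the bisection.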

Figure 6 shows the complementary cumulative distribution
function (CCDF) of the encoded word length $N$ for different
values of $W$. We observe that Huffman coding provides a
significant reduction of the number of bits required to describe
the quantized LLRs. For example, for $W = 72$, in 90% of the
realizations $N < 47$, with a compression of about 50%. For
$W \le 60$ the probability of having $N > 50$ is less than 0.001.
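
To make the role of the Huffman code concrete, the following minimal Python sketch builds the code lengths for a given distribution of quantization levels and evaluates the CCDF of the total encoded length of one data cell. It is not the compression procedure of this paper: as simplifying assumptions, all LLRs of a cell are taken as independent and share the same hypothetical distribution `pmf`, and the function names are illustrative.

```python
import heapq
from itertools import count

def huffman_lengths(pmf):
    """Code length (bits) of each symbol in a binary Huffman code for pmf."""
    if len(pmf) == 1:
        return [1]
    tie = count()  # tie-breaker so that heap entries never compare lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(pmf)]
    heapq.heapify(heap)
    lengths = [0] * len(pmf)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:          # every merged symbol gets one more bit
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths

def length_ccdf(pmf, n_llr, threshold):
    """P(total encoded length > threshold) for n_llr independent LLRs
    that all follow pmf (a simplifying assumption)."""
    lengths = huffman_lengths(pmf)
    dist = {0: 1.0}                # distribution of the running total
    for _ in range(n_llr):
        new = {}
        for total, p_tot in dist.items():
            for p, length in zip(pmf, lengths):
                new[total + length] = new.get(total + length, 0.0) + p_tot * p
        dist = new
    return sum(p for total, p in dist.items() if total > threshold)

# Hypothetical distribution of a 2-bit quantizer output.
pmf = [0.5, 0.25, 0.125, 0.125]
print(huffman_lengths(pmf))
print(length_ccdf(pmf, n_llr=12, threshold=24))
```

In this toy case the fixed-length representation of a cell takes 24 bits, so the second printed value is the probability that Huffman coding makes the word longer than the uncompressed one.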

Hereafter, we show the GMI performance of the optimized
quantization as a function of the C/N. First we note that the
optimal quantization step depends on both the C/N itself and
the channel conditions. Usually, the performance of DVB-C2
is assessed by providing the minimum C/N at which a given
BER is achieved. In terms of GMI, we can compare different
solutions by considering the minimum C/N at which a given
GMI is achieved. In practice, we can optimize both $\mathbf{q}$ and $\mathbf{w}$
considering the lowest C/N at which a target GMI is achieved,
as higher C/N values will not decrease the GMI.

Fig. 7. GMI as a function of the C/N using $N = 34$ and different values
of $W$, considering 4096-QAM over an AWGN channel.

Fig. 8. SNR gap for quantized and for quantized and compressed LLRs as a
function of $N$, for 4096-QAM at C/N = 32.2 dB over an AWGN channel.

Fig. 9. SNR gap for quantized and for quantized and compressed LLRs as a
function of $N$, for 4096-QAM at C/N = 34 dB over a block Rayleigh fading
channel.

Figure 7 shows the GMI as a function of the C/N for various
values of $W$, but with the same value of $N = 34$ bits, hence
for the same interleaver memory size. We observe that by
using the iterative compressing procedure of Section IV we do
not incur any significant loss in terms of GMI. In our example,
the outputs of the optimized quantization using $W = 38$ and
$W = 40$ bits, respectively, have been compressed to $N = 34$
bits, thus outperforming the case of plain quantization using
$W = 34$ bits. Figure 7 also shows that when the number
of bits used to represent the quantization levels is too high with
respect to the constraint length (for instance $W = 60$ and
$N = 34$), the procedure introduces a significant loss. Another
comparison between the system with quantization (QUANT)
and the system with quantization and compression (QUANT+COMP)
is provided in Figure 8, where the SNR gap is
reported as a function of the total number of compressed bits
$N$, and hence as a function of the required memory. The dotted
lines represent the SNR gap of QUANT+COMP for
different values of $W$. In other words, each of these lines
represents the performance of the optimized quantizers using
$W$ bits, whose output is then compressed from $W$ to
$N$ bits. We note that for any of these curves the SNR gap
decreases as $N$ increases, because the loss due to compression
is reduced, until $N = W$, when compression has no effect and
the SNR gap flattens. The line with star markers shows the
minimum SNR gap achievable by the QUANT+COMP approach.
This result is obtained by choosing, for each value of $N$, the
$W$ that reaches the minimum SNR gap. The performance of QUANT
is shown with gray circle markers; in this case,
as there is no compression, we consider $N = W$. Finally,
black square markers show the performance of an unoptimized
system (UNOPT), where the same quantizer is used for all data
bits of the constellation. In this case, as $w_k$ is constant for all
$k$, $W$ can only be a multiple of $\log_2 M$.

We observe that both quantizer optimization and compression
provide a significant reduction of the SNR gap with
respect to a traditional unoptimized system. As shown in
Figure 8, the optimized quantization, QUANT, outperforms the
unoptimized quantization, UNOPT, with a gain of 0.8 dB and
0.3 dB for $N = 24$ and $N = 36$, respectively. Interestingly,
the use of compression yields an advantage only for large
values of $N$. For example, if we target an SNR gap of 0.2 dB
we need $N = 29$ bits with QUANT+COMP, whereas we need
$N = 36$ bits with QUANT, with a memory reduction of
about 20%.

Note that the use of compression yields advantages only if
the loss target is small enough. For example, if we target an
SNR gap larger than 0.7 dB, the QUANT+COMP approach
does not bring any gain with respect to the QUANT approach.
In other words, it is not efficient to compress LLRs that are
already quantized optimally by using a limited number of bits.

Figure 9 shows the comparison between QUANT, QUANT+COMP,
and UNOPT in the case of a block Rayleigh fading
channel. Here, the SNR gap is computed at C/N = 34 dB
(differently from AWGN), because the C/N working point in
this case is higher. Also in this case, if the target SNR gap
is 0.2 dB, we need $N = 29$ bits with QUANT+COMP and
$N = 34$ bits with QUANT. Therefore the memory reduction
is about 15%. The performance gap between the optimized
and the unoptimized quantization is even more significant in
the case of a block Rayleigh fading channel. In fact, QUANT
shows an SNR gain of 1.1 dB and 0.6 dB for $N = 24$
and $N = 36$, respectively.

C. Memory Comparison 

We now compare the conventional scheme (CONV), illustrated
in Figure 2a, and the proposed scheme (PROP),
illustrated in Figure 2b, in terms of required memory. We
assume that all interleavers are designed such that they can
be written and read simultaneously, so that there is no need to
double the size of the memory to allow for pipelining. This
is a common feature in today's communication systems, as is
the case for the DVB-C2 system.

In CONV, for each data cell, the received complex symbol
$r_i$ and the channel estimate $h_i$ have to be stored in the
memory $M_{SD}$. In order to save memory, the receiver can
compensate the phase rotation due to the channel after its
estimation and then simply store the magnitude of the channel
estimate. Therefore the size of the memory $M_{SD}$ is

$$S(M_{SD}) = N_s (2 B_s + B_h), \tag{36}$$

where $N_s$ is the number of data cells to be interleaved, $B_s$
is the number of bits per axis used to represent $r_i$, and $B_h$ is
the number of bits used to represent $h_i$. In PROP, instead, the
compressed LLRs associated with one data cell occupy at most
$N$ bits, so the size of the memory $M_{SD}$ is

$$S(M_{SD}) = N_s N. \tag{37}$$

The memory for the bit interleaver $M_{BD}$ in both schemes is

$$S(M_{BD}) = \frac{N_b W}{\log_2 M}, \tag{38}$$

where $N_b$ is the depth of the bit interleaver. Note that here the
compressing procedure is not applicable because the LLRs are
moved one by one by the bit interleaver; therefore the LLR of
bit $k$ is represented by $w_k$ uncompressed bits. For DVB-C2,
the maximum value of $N_b$ is 64 800, and for the symbol
interleaver $N_s$ is at most 51776. Therefore in DVB-C2 the
memory required by $M_{SD}$ dominates the memory required
by $M_{BD}$. In all the following assessments, we consider
the worst case, 4096-QAM, which maximizes the size of
$M_{SD}$. DVB-C2 performance assessments show that, in order
to have an SNR gap smaller than 0.1 dB, we have to use at least
$B_s = 15$ and $B_h = 14$ bits to represent $r_i$ and $h_i$, respectively,
and $w_k = 5$ bits for each LLR. Thus the total required memory
size, $S(M_{Tot}) = S(M_{SD}) + S(M_{BD})$, is around 2.6 Mbit. On the
contrary, in the proposed scheme we are able to reach the same
target using the compressing procedure with parameters $W = 42$
and $N = 35$. The total memory size becomes 2.03 Mbit, thus
enabling a 21.6% memory saving. Note that in PROP we can
represent $r_i$ and $h_i$ with as much precision as needed to
have a negligible loss: the values of $B_s$ and $B_h$ have no
effect on the interleaver memory size. If the target on the SNR
gap is more relaxed, for instance 0.2 dB, the saved memory
becomes even larger. In fact, in CONV, to obtain an SNR gap
smaller than 0.2 dB, we need $B_s = 14$, $B_h = 13$,
and $w_k = 5$, so the total required memory is around 2.44
Mbit. In PROP, instead, the target is achieved using $W = 36$
and $N = 29$ compressed bits per data cell, thus requiring
about 1.69 Mbit and achieving a memory reduction of
more than 30%. It is interesting to note that, even without
compression, the total memory size is reduced by at least
15% with respect to the conventional receiver. The required
memory size and the potential memory saving are summarized
in Table V. Finally, we can conclude that by using PROP it is
possible to reduce the memory at the receiver by at least 15%,
and, if the receiver can support the additional complexity due
to the compressor, the saved memory goes up to 30%.
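
The memory figures quoted in this subsection can be cross-checked with a short Python sketch. It assumes that the CONV symbol memory stores, for each data cell, two $B_s$-bit components plus a $B_h$-bit channel magnitude, that the PROP symbol memory stores $N$ bits per cell, that the bit interleaver stores $W / \log_2 M$ bits per LLR, and that 1 Mbit means 10^6 bits. The function names are illustrative.

```python
N_S = 51_776        # data cells in the symbol interleaver (DVB-C2 maximum)
N_B = 64_800        # depth of the bit interleaver (DVB-C2 maximum)
BITS_PER_CELL = 12  # log2(4096) coded bits per 4096-QAM data cell

def conv_memory(b_s, b_h, w_total):
    """Total bits for CONV: symbol memory as in (36) plus bit memory as in (38)."""
    symbol_mem = N_S * (2 * b_s + b_h)
    bit_mem = N_B * w_total // BITS_PER_CELL
    return symbol_mem + bit_mem

def prop_memory(n, w_total):
    """Total bits for PROP: symbol memory as in (37) plus bit memory as in (38)."""
    return N_S * n + N_B * w_total // BITS_PER_CELL

def saving(conv_bits, prop_bits):
    """Fraction of memory saved by PROP with respect to CONV."""
    return 1.0 - prop_bits / conv_bits

for target, (b_s, b_h), (w, n) in [("0.1 dB", (15, 14), (42, 35)),
                                   ("0.2 dB", (14, 13), (36, 29))]:
    conv = conv_memory(b_s, b_h, 60)  # w_k = 5 bits for each of the 12 LLRs
    prop = prop_memory(n, w)
    print(target, conv / 1e6, prop / 1e6, f"{100 * saving(conv, prop):.1f}%")
```

Under these assumptions the totals are about 2.60 and 2.04 Mbit for the 0.1 dB target (a saving of about 21.6%) and about 2.45 and 1.70 Mbit for the 0.2 dB target (a saving of about 30.7%), in line with the values quoted in the text up to rounding.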

VI. Conclusions 

In this paper we have proposed a new technique for the
compression of LLRs in a communication system that uses
interleavers with long block sizes. Using as criterion the
maximization of the GMI between the transmitted bits and the
quantized and compressed LLRs, we have presented a quantizer
optimization method and a suitable compression technique.
Results show that, by using the proposed architecture, it is
possible to save 15% of the memory without introducing any
further significant complexity at the receiver, simply by employing
the proposed optimized quantizer. Furthermore, at the expense
of the additional complexity due to the proposed compressing
procedure, the memory saving can rise up to 30%.

TABLE V

Memory comparison.

Loss target   Receiver scheme   $B_s$   $B_h$   $W$   $N$   $S(M_{SD})$ [Mbit]   $S(M_{BD})$ [Mbit]   $S(M_{Tot})$ [Mbit]   Saved memory
0.1 dB        CONV              15      14      60    -     2.27                 0.32                 2.60                  -
0.1 dB        PROP              -       -       42    35    1.81                 0.22                 2.03                  21.6 %
0.2 dB        CONV              14      13      60    -     2.12                 0.32                 2.44                  -
0.2 dB        PROP              -       -       36    29    1.50                 0.19                 1.69                  30.6 %
0.2 dB        PROP w/o COMP     -       -       36    -     1.86                 0.22                 2.06                  15.6 %

Appendix

In the following, we report the proof of the optimality
of the greedy procedure in the case of upper convexity of $I_{k,w_k}$.

Proof: Let $\delta_{i,j} = I_{i,j} - I_{i,j-1}$ be the elements of a matrix
$\Delta = \{\delta_{i,j}\}$ having dimension $\log_2 M \times W$. Since
$0 \le \delta_{i,j} \le \delta_{i,j-1}$, each row of $\Delta$ is a non-increasingly sorted vector.
We can rewrite the optimization (27) as follows:

$$\max_{\mathbf{w}} \sum_{i=1}^{\log_2 M} \sum_{j=1}^{w_i} \delta_{i,j} \quad \text{s.t. (25b)}. \tag{39}$$

Clearly the optimization objective is maximized when the
largest $W$ elements of matrix $\Delta$ are summed. Let $\delta_{[\ell]}$ be the
$\ell$-th largest element of $\Delta$; then we can write the maximized
optimization objective as

$$\sum_{\ell=1}^{W} \delta_{[\ell]}. \tag{40}$$

Assuming that $\mathbf{w} = \{w_1, \ldots, w_{\log_2 M}\}$ is the vector that
maximizes (27) using $W$ bits, we can write the $(W+1)$-th
largest element of $\Delta$ as

$$\delta_{[W+1]} = \max_i \left\{ \max_{j > w_i} \delta_{i,j} \right\}. \tag{41}$$

Since $\delta_{i,j} \le \delta_{i,j-1}$, it becomes

$$\delta_{[W+1]} = \max_i \delta_{i,w_i+1}. \tag{42}$$

That is precisely the rule used in our procedure. Therefore the
proposed procedure will distribute the remaining $W$ bits in an
optimal way, i.e., returning the same result as an exhaustive
search. ■
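
The rule in (42) can be sketched in Python and compared against an exhaustive search on a small example. The BGMI table below is hypothetical: it is built from a concave function so that the increments in each row are non-increasing, as the proof requires. As further assumptions, the allocation starts from zero bits and assigns one bit at a time, and the function names are illustrative.

```python
from itertools import product

def greedy_allocation(bgmi, total_bits):
    """Assign total_bits one at a time to the position with the largest gain.

    bgmi[i][j] is the BGMI of position i when j bits are used (j = 0, 1, ...).
    """
    if total_bits > sum(len(row) - 1 for row in bgmi):
        raise ValueError("not enough table entries for the requested bits")
    w = [0] * len(bgmi)
    for _ in range(total_bits):
        # Gain of giving one more bit to each position (rule of (42)).
        gains = [row[wi + 1] - row[wi] if wi + 1 < len(row) else float("-inf")
                 for row, wi in zip(bgmi, w)]
        best = max(range(len(w)), key=lambda i: gains[i])
        w[best] += 1
    return w

def exhaustive_allocation(bgmi, total_bits):
    """Try every allocation that uses exactly total_bits and keep the best."""
    best_w, best_val = None, float("-inf")
    for w in product(*(range(len(row)) for row in bgmi)):
        if sum(w) != total_bits:
            continue
        val = sum(row[wi] for row, wi in zip(bgmi, w))
        if val > best_val:
            best_w, best_val = list(w), val
    return best_w

# Hypothetical concave BGMI table: three positions, 0 to 4 bits each.
scales = [0.9, 0.5, 0.2]
bgmi = [[c * (1.0 - 2.0 ** -j) for j in range(5)] for c in scales]

w_greedy = greedy_allocation(bgmi, 6)
w_exhaustive = exhaustive_allocation(bgmi, 6)
print(w_greedy, w_exhaustive)
```

On this table both searches return the same allocation, while the greedy one evaluates only a handful of increments instead of all combinations.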